Statistical Regularization and Qualitative Constraints
Abstract
Comprehensive characterization of a proteome is a fundamental goal in proteomics. To maximize proteome coverage for a complex protein mixture, i.e. to identify as many proteins as possible, several fractionation experiments are typically performed and the individual fractions are subjected to mass spectrometric analysis. The resulting data are integrated into large and heterogeneous datasets. Proteome coverage prediction is the task of extrapolating the number of protein discoveries by future measurements, conditioned on a sequence of measurements already performed. Predicting proteome coverage at an early stage enables experimentalists to design and plan efficient proteomics studies. To date, no method reliably predicts proteome coverage from integrated datasets. We present a generalized hierarchical Pitman-Yor process model that explicitly captures the redundancy within integrated datasets. We assess the proteome coverage prediction accuracy of our approach on an integrated proteomics dataset for the bacterium L. interrogans and demonstrate that it outperforms ad hoc extrapolation methods and prediction methods designed for non-integrated datasets. Furthermore, we estimate the maximally achievable proteome coverage for the experimental setup underlying the L. interrogans dataset. We discuss the implications of our results for determining rational stop criteria and their influence on the design of efficient and reliable proteomics studies.
Related articles
DFG-SNF Research Group FOR916 Statistical Regularization and Qualitative Constraints
We generalize a theorem of Shao (1995, Proc. Am. Math. Soc. 123, 575-582) on the almost-sure limiting behavior of the maximum of standardized random walk increments to multidimensional arrays of i.i.d. random variables. The main difficulty is the absence of an appropriate strong approximation result in the multidimensional setting. The multiscale statistic under consideration was used recently ...
Application of Network RTK Positions and Geometric Constraints to the Problem of Attitude Determination Using the GPS Carrier Phase Measurements
Navigation is now indispensable in military and civil air transportation. The Global Positioning System (GPS) is commonly used for computing the orientation or attitude of a moving platform. The relative positions of the GPS antennas are computed using the GPS code and/or phase measurements. To achieve a precise attitude determination, carrier phase observations of GPS requiring...
Statistical Regularization and Qualitative Constraints
The Augmented Lagrangian Method as an approach for regularizing inverse problems has received much attention recently, e.g. under the name Bregman iteration in imaging. This work shows convergence (rates) for this method when Morozov’s discrepancy principle is chosen as a stopping rule. Moreover, error estimates for the involved sequence of subgradients are pointed out. The paper studies implicatio...
Qualitative Assumptions and Regularization in High-Dimensional Statistics
Important and exciting developments are currently underway in nonparametric statistics involving interplay between qualitative constraints, penalization, and regularization methods. Some of these developments are taking place on the theoretical side (with connections in the direction of empirical process theory), while other parts of the development are occurring on the algorithmic and approxi...
Statistical Significance Based Graph Cut Regularization for Medical Image Segmentation
Graph cut minimization formulates image segmentation as a linear combination of problem constraints. The salient constraints in computer vision problems are data and smoothness, which are combined through a regularization parameter. The main task of the regularization parameter is to determine the weight of the smoothness constraint on the graph energy. However, the difference in functio...
Improved statistical power of the multilinear reference tissue approach to the quantification of neuroreceptor ligand binding by regularization.
A multilinear reference tissue approach has been widely used recently for the assessment of neuroreceptor-ligand interactions with positron emission tomography. The authors analyzed this "multilinear method" with respect to its sensitivity to statistical noise, and propose regularization procedures that reduce the effects of statistical noise. Computer simulations and singular value decompositi...